Ramp-up is a significant bottleneck in the introduction of new or adapted manufacturing systems. The effort and time required to ramp up a system depend largely on the effectiveness of the human decision-making process in selecting the most promising sequence of actions to bring the system to the required level of performance. Although existing work has identified significant factors influencing the effectiveness of ramp-up, little has been done to support decision making during the process. This paper approaches ramp-up as a sequential adjustment and tuning process that aims to bring a manufacturing system to a desirable performance in the shortest possible time. Production stations and machines are the key resources in a manufacturing system. They are often functionally decoupled and can be treated, in the first instance, as independent ramp-up problems. Hence, this paper focuses on developing a Markov decision process (MDP) model to formalize the ramp-up of production stations and enable their formal analysis. The aim is to capture the cause-and-effect relationships between an operator's adaptation or adjustment of a station and the station's response, in order to improve the effectiveness of the process. Reinforcement learning has been identified as a promising approach to learn from ramp-up experience and discover more successful decision-making policies. Batch learning in particular can perform well with little data. This paper investigates the application of a Q-batch learning algorithm combined with an MDP model of the ramp-up process. The approach has been applied to a highly automated production station on which several ramp-up processes were carried out. The convergence of the Q-learning algorithm has been analyzed along with the variation of its parameters. Finally, the learned policy has been applied and compared against previous ramp-up cases.
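To make the batch Q-learning idea concrete, the following is a minimal sketch of learning a policy from a fixed batch of experience on a toy ramp-up MDP. The states (`low`, `mid`, `target`), actions (`tune`, `wait`), rewards, and transitions are hypothetical illustrations, not the paper's production-station model; the update rule is the standard tabular Q-learning update swept repeatedly over the stored transitions.

```python
# Sketch of batch (offline) Q-learning: replay a fixed set of logged
# (state, action, reward, next_state) transitions until the Q-values settle.
# The toy MDP below is an invented illustration, not the paper's model.
from collections import defaultdict

def q_batch_learn(transitions, actions, alpha=0.1, gamma=0.9, sweeps=200):
    """Sweep the batch repeatedly, applying the Q-learning update
    Q(s,a) += alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
    Q = defaultdict(float)
    for _ in range(sweeps):
        for s, a, r, s_next in transitions:
            best_next = max(Q[(s_next, b)] for b in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q

# Toy batch: states are coarse performance levels of a station;
# 'tune' adjusts the station toward the target, 'wait' leaves it as-is.
batch = [
    ("low", "tune", 0.0, "mid"),
    ("mid", "tune", 1.0, "target"),
    ("low", "wait", 0.0, "low"),
    ("mid", "wait", 0.0, "mid"),
    ("target", "wait", 1.0, "target"),
]
actions = ["tune", "wait"]
Q = q_batch_learn(batch, actions)

# Greedy policy extracted from the learned Q-values.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in ["low", "mid"]}
```

Because reward only flows from reaching the target state, the values propagate backward through the batch and the greedy policy recommends `tune` in both non-target states; the same replay-until-convergence scheme is what lets batch methods extract a policy from the small amount of data a few ramp-up runs provide.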